(54) Speech encoder with features extracted from current and previous frames



(57) In a speech signal encoder device comprising a frame divider (31) for producing original speech
frames, a mode decision circuit (49) decides a predetermined number of modes by using feature
quantities which are extracted from each current speech frame segmented from an input speech signal
at a predetermined frame period of as short as 5 ms and from a previous speech frame segmented at
least one frame period prior to the current speech frame. Preferably, a weighting circuit (47)
provides the current speech frame by perceptually weighting the original speech frames into weighted
speech frames. It is possible to provide the feature quantities by a primary quantity and, as a
secondary quantity, by a rate of variation in the primary quantity. Each feature quantity is
preferably adjusted into an adjusted quantity in response to each current mode decided by using the
current speech frame and a previous mode decided at least one frame period prior to the current
mode. Each feature quantity may be a pitch prediction gain, a short-period predicted gain, a level,
or a pitch of each original speech frame.

Description

This invention relates to a speech encoder device for encoding a speech or voice signal at a short frame period into 
encoder output codes having a high code quality. 

A speech encoder device of this type is described as a speech codec in a paper contributed by
Kazunori Ozawa and five others including the present sole inventor to the IEICE Trans. Commun.,
Volume E77-B, No. 9 (September 1994), pages 1114 to 1121, under the title of "M-LCELP Speech Coding
at 4 kb/s with Multi-Mode and Multi-Codebook". According to this Ozawa et al paper, an input speech
signal is encoded as follows.

The input speech signal is segmented or divided into original speech frames, each typically having a
frame period or length of 40 ms. By LPC (linear predictive coding), spectral parameters
representative of spectral characteristics of the speech signal are extracted from the speech
frames. Before feature or characteristic quantities are calculated, it is preferred to convert the
original speech frames to weighted speech frames by using a perceptual or auditory weight. The
feature quantities are used in deciding modes of segments, such as vowel and consonant segments, to
produce decided mode results indicative of the modes.

In an encoding part of this Ozawa et al encoder device, each original frame is subdivided into
original subframe signals, each being typically 8 ms long. Such speech subframes are used in
deciding excitation signals. In accordance with the modes, adaptive parameters (delay parameters
corresponding to pitch periods and gain parameters) are extracted from an adaptive codebook for each
current speech subframe based on a previous excitation signal. In this manner, the adaptive codebook
is used in extracting pitches of the speech subframes with prediction. For a residual signal
obtained by pitch prediction, an optimal excitation code vector is selected from a speech codebook
(vector quantization codebook) composed of noise signals of a predetermined kind. The excitation
signals are quantized by calculating an optimal gain.

The excitation code vector is selected so as to minimize an error power between the residual signal
and a signal composed of the selected noise signal. Either for transmission to a speech decoder
device or for storage in a recording device for later reproduction, a multiplexer is used to produce
an encoder device output signal into which the mode results and indexes indicative of the adaptive
parameters, including the gain parameters, and of the kind of optimal excitation code vectors are
multiplexed.

In the conventional speech encoder device of Ozawa et al, it is necessary, in order to reduce a
processing delay, to use a short frame period for the original or the weighted speech frames. The
feature quantities are subjected to considerable fluctuations with time when the frame period is
5 ms or shorter. The fluctuations give rise to unstable and erroneous interswitching of the modes
and therefore to a deteriorated code quality.

Moreover, selected modes, predicted pitches, and extracted levels are subjected to appreciable
fluctuations when the frame period is 5 ms or shorter. The appreciable fluctuations give rise not
only to the unstable and erroneous interswitching but also to unstable and erroneous pitch
extraction and level extraction, and accordingly to a deteriorated code quality.

When the levels of the input speech signal are used on encoding the input speech signal, indexes indicative of the 
levels are additionally used in the encoder device output signal. When the pitches are used, the encoder device output 
signal need not include the indexes indicative of the pitches. 

In view of the foregoing, it is an object of the present invention to provide a speech encoder
device operable with a short processing delay even when an input speech signal is segmented into
original speech frames of a short frame period, such as 5 to 10 ms long or shorter.

It is another object of this invention to provide a speech encoder device which is of the type described and which 
can prevent feature quantities from being subjected to appreciable fluctuations with time. 

It is still another object of this invention to provide a speech encoder device which is of the type
described and which can exactly decide modes for the original frames or for weighted frames.

It is yet another object of this invention to provide a speech encoder device which is of the type described and which 
can exactly extract pitches from speech subframes. 

It is a further object of this invention to provide a speech encoder device which is of the type described to produce 
encoder output codes of a high code quality. 

Other objects of this invention will become clear as the description proceeds.

In accordance with an aspect of this invention, there is provided a speech signal encoder device
comprising (a) segmenting means for segmenting an input speech signal into original speech frames at
a predetermined frame period, (b) deciding means for using the original speech frames in deciding a
predetermined number of modes of the original speech frames to produce decided mode results, and
(c) encoding means for encoding the input speech signal into codes at the frame period and in
response to the modes to produce the decided mode results and the codes as an encoder device output
signal, wherein the deciding means decides the modes by using feature quantities of each current
speech frame segmented from the input speech signal at the frame period and a previous speech frame
segmented at least one frame period prior to the current speech frame.

In accordance with another aspect of this invention, there is provided a speech signal encoder
device comprising (a) segmenting means for segmenting an input speech signal into original speech
frames at a predetermined frame period, (b) extracting means for using the original speech frames in
extracting pitches from the input speech signal, and (c) encoding means for encoding the input
speech signal at the frame period and in response to the pitches into codes for use as an encoder
device output signal, wherein the extracting means extracts the pitches by using each current speech
frame segmented from the input speech signal at the frame period and a previous speech frame
segmented at least one frame period prior to the current speech frame.

In accordance with a different aspect of this invention, there is provided a speech signal encoder
device comprising (a) segmenting means for segmenting an input speech signal into original speech
frames at a predetermined frame period, (b) deciding means for using the original speech frames in
deciding a predetermined number of modes of the original speech frames to produce decided mode
results, and (c) encoding means for encoding the input speech signal into codes at the frame period
and in response to the modes to produce the decided mode results and the codes as an encoder device
output signal, wherein the deciding means makes use, in deciding a current mode of the modes for
each current speech frame segmented from the input speech signal at the frame period, of feature
quantities of at least one kind extracted from the current speech frame and a previous speech frame
segmented at least one frame period prior to the current speech frame and of a previous mode decided
at least one frame period prior to the current mode.

In accordance with another different aspect of this invention, there is provided a speech signal
encoder device comprising (a) segmenting means for segmenting an input speech signal into original
speech frames at a predetermined frame period, (b) deciding means for using the original speech
frames in deciding a predetermined number of modes of the original speech frames to produce decided
mode results, (c) extracting means for extracting pitches from the input speech signal, and
(d) encoding means for encoding the input speech signal into codes at the frame period and in
response to the modes to produce the decided mode results and the codes as an encoder device output
signal, wherein: (A) the extracting means comprises: (A1) feature quantity extracting means for
extracting feature quantities by using at least each current speech frame segmented from the input
speech signal at the frame period; and (A2) feature quantity adjusting means for using the feature
quantities as the pitches to adjust the pitches into adjusted pitches in response to each current
mode decided for the current speech frame and a previous mode decided at least one frame period
prior to the current mode; (B) the encoding means encoding the input speech signal into the codes in
response further to the adjusted pitches.

In accordance with still another different aspect of this invention, there is provided a speech
signal encoder device comprising (a) segmenting means for segmenting an input speech signal into
original speech frames at a predetermined frame period, (b) deciding means for using the original
speech frames in deciding a predetermined number of modes of the original speech frames to produce
decided mode results, (c) extracting means for extracting levels from the input speech signal, and
(d) encoding means for encoding the input speech signal into codes at the frame period and in
response to the modes to produce the decided mode results and the codes as an encoder device output
signal, wherein: (A) the extracting means comprises: (A1) feature quantity extracting means for
extracting feature quantities by using at least each current speech frame segmented from the input
speech signal at the frame period; and (A2) feature quantity adjusting means for using the feature
quantities as the levels to adjust the levels into adjusted levels in response to each current mode
decided for the current speech frame and a previous mode decided at least one frame period prior to
the current mode; (B) the encoding means encoding the input speech signal into the codes in response
further to the adjusted levels.

Fig. 1 is a block diagram of a speech signal encoder device according to a first embodiment of the instant invention;
Fig. 2 is a block diagram of a mode decision circuit used in the speech signal encoder device illustrated in Fig. 1;
Fig. 3 is a block diagram of another mode decision circuit for use in a speech signal encoder device according to a second embodiment of this invention;
Fig. 4 is a block diagram of a pitch extracting circuit for use in a speech encoder device according to a third embodiment of this invention;
Fig. 5 is a block diagram of a speech signal encoder device according to a fourth embodiment of this invention;
Fig. 6 is a block diagram of a speech signal encoder device according to a fifth embodiment of this invention;
Fig. 7 is a block diagram of a mode decision circuit used in the speech signal encoder device illustrated in Fig. 6;
Fig. 8 is a block diagram of another mode decision circuit for use in the speech signal encoder device shown in Fig. 6;
Fig. 9 shows in blocks a feature quantity calculator used in the mode decision circuit depicted in Fig. 8;
Fig. 10 shows in blocks another feature quantity calculator used in the mode decision circuit depicted in Fig. 8;
Fig. 11 shows in blocks a different feature quantity calculator for use in place of the feature quantity calculator illustrated in Fig. 10;
Fig. 12 is a block diagram of still another mode decision circuit for use in the speech signal encoder device shown in Fig. 6;
Fig. 13 shows a feature quantity calculator used in the mode decision circuit depicted in Fig. 12;
Fig. 14 shows in blocks a different feature quantity calculator for use in place of the feature quantity calculator illustrated in Fig. 12;
Fig. 15 is a block diagram of yet another mode decision circuit for use in the speech encoder device shown in Fig. 6;
Fig. 16 is a block diagram of a speech signal encoder device according to a sixth embodiment of this invention;
Fig. 17 is a block diagram of a pitch extracting circuit used in the speech signal encoder device illustrated in Fig. 16;
Fig. 18 shows in blocks an additional feature quantity calculator used in the pitch extracting circuit depicted in Fig. 17;
Fig. 19 is a block diagram of another pitch extracting circuit for use in the speech signal encoder device illustrated in Fig. 16;
Fig. 20 shows in blocks another additional feature quantity calculator for use in the pitch extracting circuit depicted in Fig. 17;
Fig. 21 is a block diagram of still another pitch extracting circuit for use in the speech signal encoder device illustrated in Fig. 16;
Fig. 22 shows in blocks an additional feature quantity calculator used in the pitch extracting circuit depicted in Fig. 21;
Fig. 23 is a block diagram of yet another pitch extracting circuit for use in the speech signal encoder device illustrated in Fig. 16;
Fig. 24 shows in blocks an additional feature quantity calculator used in the pitch extracting circuit depicted in Fig. 23;
Fig. 25 is a block diagram of a speech signal encoder device according to a seventh embodiment of this invention;
Fig. 26 is a block diagram of an RMS extracting circuit used in the speech signal encoder device illustrated in Fig. 25;
Fig. 27 is a block diagram of another RMS extracting circuit for use in the speech signal encoder device illustrated in Fig. 25;
Fig. 28 is a block diagram of still another RMS extracting circuit for use in the speech signal encoder device illustrated in Fig. 25;
Fig. 29 is a block diagram of yet another RMS extracting circuit for use in the speech signal encoder device illustrated in Fig. 25; and
Fig. 30 is a block diagram of a further RMS extracting circuit for use in the speech signal encoder device illustrated in Fig. 25.

Referring to Fig. 1, a speech signal encoder device is according to a first preferred embodiment of
the present invention. An input speech or voice signal is supplied to the speech signal encoder
device through a device input terminal 31. The speech signal encoder device comprises a multiplexer
(MUX) 33 for delivering an encoder output signal to a device output terminal 35.

Delivered through the device input terminal 31, the input speech signal is segmented or divided by a
frame dividing circuit 37 into original speech frames at a frame period which is typically 5 ms
long. A subframe dividing circuit 39 further divides each original speech frame into original speech
subframes, each having a subframe period of, for example, 2.5 ms.
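
The frame and subframe division just described can be sketched as follows. This is only an illustration in Python: the 8 kHz sampling rate, the function name `split_frames`, and the choice to drop a trailing incomplete frame are assumptions that do not come from this description.

```python
def split_frames(samples, sample_rate_hz=8000, frame_ms=5.0, subframe_ms=2.5):
    """Split samples into frames, each returned as a list of subframes.

    The 8 kHz sampling rate is an assumed value; the description only gives
    the 5 ms frame period and the 2.5 ms subframe period.
    """
    frame_len = int(round(sample_rate_hz * frame_ms / 1000))
    sub_len = int(round(sample_rate_hz * subframe_ms / 1000))
    frames = []
    # A trailing incomplete frame is dropped (assumption of this sketch).
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        frames.append([frame[s:s + sub_len] for s in range(0, frame_len, sub_len)])
    return frames
```

At the assumed 8 kHz rate, each 5 ms frame holds 40 samples and each 2.5 ms subframe holds 20 samples.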

Although connected in Fig. 1 to the frame dividing circuit 37, a spectral parameter calculator 41
calculates spectral parameters of the input speech signal up to a predetermined order, such as up to
a tenth order (P = 10), by applying a window of a window length of typically 24 ms to at least one
of the speech subframes. In the example being illustrated, the spectral parameter calculator 41
calculates the spectral parameters according to Burg analysis described in a book written by
Nakamizo and published in 1988 by Korona-Sya under the title of, as transliterated according to
ISO 3602, "Singô Kaiseki to Sisutemu Dôtei" (Signal Analysis and System Identification), pages 82 to
87. It is possible to use an LPC analyzer or the like as the spectral parameter calculator 41.

Besides calculating linear prediction coefficients α(i) by the Burg analysis for i = 1, 2, ..., and
10, the spectral parameter calculator 41 converts the linear prediction coefficients to LSP (linear
spectral pair) parameters which are suitable to quantization and interpolation. In the spectral
parameter calculator 41 being illustrated, the linear prediction coefficients are converted to the
LSP parameters according to a paper contributed by Sugamura and another to the Transactions of the
Institute of Electronics and Communication Engineers of Japan, J64-A (1981), pages 599 to 606, under
the title of "Sen-supekutoru Tui Onsei Bunseki Gôsei Hôsiki ni yoru Onsei Zyôhô Assyuku" (Speech
Data Compression by LSP Speech Analysis-Synthesis Technique, as translated by the contributors).

More particularly, each speech frame consists of first and second subframes in the example being
described. The linear prediction coefficients are calculated and converted to the LSP parameters for
the second subframe. For the first subframe, the LSP parameters are calculated by linear
interpolation of the LSP parameters of second subframes and are inverse converted to the linear
prediction coefficients. In this manner, the spectral parameter calculator 41 produces LSP
parameters and linear prediction coefficients α(i, p) for the first and the second subframes, where
p = 1, 2 and 5.
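
The linear interpolation of LSP parameters for the first subframe might look like the sketch below. The function name `interpolate_lsp` and the midpoint weight of 0.5 are assumptions made for illustration; the description does not state the interpolation weights.

```python
def interpolate_lsp(prev_second, curr_second, weight=0.5):
    """LSP parameters of the first subframe by linear interpolation.

    prev_second: LSP parameters of the second subframe of the previous frame.
    curr_second: LSP parameters of the second subframe of the current frame.
    weight: position between the two sets; 0.5 (midpoint) is an assumed value.
    """
    return [(1.0 - weight) * a + weight * b
            for a, b in zip(prev_second, curr_second)]
```

The interpolated parameters would then be inverse converted to linear prediction coefficients, a step this sketch leaves out.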

Supplied from the spectral parameter calculator 41 with the LSP parameters of each predetermined
subframe, such as the second subframe, a spectral parameter quantizer 43 converts the linear
prediction coefficients to converted prediction coefficients α′(i, p) for each subframe.
Furthermore, the spectral parameter quantizer 43 vector quantizes the linear prediction
coefficients.

To speak of this vector quantization first, it is possible to use various known methods. An example
is described in a paper contributed by Toshiyuki Hamada and three others to the Proc. Mobile
Multimedia Communications, pages B.2.5-1 to B.2.5-4 (1993), under the title of "LSP Coding Using
VQ-SVQ with Interpolation in 4.075 kbps M-LCELP Speech Coder". Other examples are disclosed in
Japanese Patent Prepublication (A) Nos. 171,500 of 1992, 363,000 of 1992, and 6,199 of 1993. In the
example being illustrated, use is made of an LSP codebook 45.

As for conversion into the converted prediction coefficients, the spectral parameter quantizer 43
first reproduces the LSP parameters for the first and the second subframes from the LSP parameters
quantized in connection with each second subframe. In practice, the LSP parameters are reproduced by
linear interpolation between the quantized prediction coefficients of a current one of the second
subframes and those of a previous one of the second subframes that is one frame period prior to the
current one of the second subframes.

More in detail, the spectral parameter quantizer 43 is operable as follows. First, a code vector is
selected so as to minimize an error power between the LSP parameters before and after quantization.
The LSP parameters for the first and the second subframes are then reproduced by linear
interpolation. In order to achieve a high quantization efficiency, it is possible to preselect a
plurality of code vector candidates for minimization of the error power, to calculate cumulative
distortions in connection with the candidates, and to select one of combinations of interpolated LSP
parameters that minimizes the cumulative distortions.
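
The first step, selecting the code vector that minimizes the error power, can be sketched as an exhaustive search. The function name `quantize_lsp` and the plain list-of-lists codebook layout are assumptions for illustration; preselection of candidates and the cumulative-distortion step are not modelled.

```python
def quantize_lsp(lsp, codebook):
    """Return (index, code vector) minimizing the squared error against lsp."""
    def error_power(code_vector):
        # Error power between the LSP parameters before and after quantization.
        return sum((x - c) ** 2 for x, c in zip(lsp, code_vector))

    index = min(range(len(codebook)), key=lambda j: error_power(codebook[j]))
    return index, codebook[index]
```

Only the returned index would need to be passed to the multiplexer, since the decoder is assumed to hold the same codebook.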

Alternatively, it is possible instead of the linear interpolation to prepare interpolation LSP
patterns for a predetermined number of bits, such as two bits, and to select one of combinations of
the interpolation LSP patterns that minimizes the cumulative distortions as regards the first and
the second subframes. This results in an increase in an amount of output information, although this
makes it possible to more exactly follow variations of the LSP parameters in each speech frame.

It is possible either to prepare the interpolation LSP patterns by learning of LSP data for training
or to store predetermined patterns. For storage, the patterns may be those described in a paper
contributed by Tomohiko Taniguchi and three others to the Proc. ICSLP (1992), pages 41 to 44, under
the title of "Improved CELP Speech Coding at 4 kbit/s and below". Alternatively, it is possible for
further improved performance to preselect the interpolation LSP patterns, to calculate an error
signal between actual values of the LSP parameters and interpolated LSP values, and to quantize the
error signal with reference to an error codebook (not shown).

The spectral parameter quantizer 43 produces the converted prediction coefficients for the
subframes. In addition, the spectral parameter quantizer 43 supplies the multiplexer 33 with indexes
indicative of the code vectors selected for quantized prediction coefficients in connection with the
second subframes.

Connected to the subframe dividing circuit 39 and to the spectral parameter calculator and quantizer
41 and 43, a perceptual weighting circuit 47 gives perceptual or auditory weights γ^i to respective
samples of the speech subframes to produce a perceptually weighted signal x_w(n), where n represents
sample identifiers of the respective speech samples in each frame. The weights are decided primarily
by the linear prediction coefficients.

Supplied with the perceptually weighted signal frame by frame, a mode decision circuit 49 extracts
feature quantities from the perceptually weighted signal. Furthermore, the mode decision circuit 49
uses the feature quantities in deciding modes as regards frames of the perceptually weighted signal
to produce decided mode results indicative of the modes.

Turning temporarily to Fig. 2 with Fig. 1 continuously referred to, the mode decision circuit 49 is
operable as follows in the speech encoder device being illustrated. The mode decision circuit 49 has
mode decision circuit input and output terminals 49(I) and 49(O) supplied with the perceptually
weighted signal and producing the decided mode results.

Supplied through the mode decision circuit input terminal 49(I) with the perceptually weighted
signal frame by frame, a feature quantity calculator 51 calculates in this example a pitch
prediction gain G. A frame delay (D) 53 is for giving one frame delay to the pitch prediction gain
to produce a one-frame delayed gain. A weighted sum calculator 55 calculates a weighted sum Gav of
the pitch prediction gain and the one-frame delayed gain according to:

$$G_{av} = \sum_{i=1}^{2} \nu(i)\,G(i),$$

where ν(i) represents gain weights for i-th subframes.

The feature quantities are given typically by such weighted sums in connection with each current
frame and a previous frame which is one frame period prior to the current frame. Supplied with the
feature quantities, a mode decision unit 57 selects one of the modes for each current frame and
delivers the decided mode results in successive frame periods to the mode decision circuit output
terminal 49(O).

The mode decision unit 57 has a plurality of predetermined threshold values, for example, three in
number. In this event, the modes are four in number. The decided mode results are delivered to the
multiplexer 33.
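
The weighted sum and the threshold comparison can be sketched together as below. The equal gain weights (0.5, 0.5), the three threshold values, the zero initial delayed gain, and the function names are illustrative assumptions; the description gives only the structure (two gains, three thresholds, four modes).

```python
from bisect import bisect_right


def smoothed_gain(current_gain, delayed_gain, weights=(0.5, 0.5)):
    """Weighted sum Gav of the current and the one-frame delayed gains."""
    return weights[0] * current_gain + weights[1] * delayed_gain


def decide_modes(gains, weights=(0.5, 0.5), thresholds=(1.0, 4.0, 7.0)):
    """Return one mode number (0 to 3) per frame from per-frame gains G.

    Weights and thresholds are assumed values chosen for illustration.
    """
    modes = []
    delayed = 0.0  # assumed initial content of the frame delay
    for gain in gains:
        gav = smoothed_gain(gain, delayed, weights)
        # Three thresholds split the Gav axis into four modes.
        modes.append(bisect_right(thresholds, gav))
        delayed = gain
    return modes
```

With these assumed values, the per-frame gains [0.0, 10.0, 10.0, 0.0] give the modes [0, 2, 3, 2]: the mode moves through an intermediate value at each transition instead of jumping between the extremes, which is the smoothing effect that using the previous frame is meant to provide.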

In Fig. 1, the spectral parameter calculator and quantizer 41 and 43 supply a response signal
calculator 59 with the linear prediction coefficients subframe by subframe and with the converted
prediction coefficients also subframe by subframe. The response signal calculator 59 keeps filter
memory values for respective subframes. In response to a response calculator input signal d(n) which
will presently become clear, the response signal calculator 59 calculates a response signal x_z(n)
for each subframe according to:

$$x_z(n) = d(n) - \sum_{i=1}^{10} \alpha(i)\,d(n-i) + \sum_{i=1}^{10} \alpha(i)\,\gamma^{i}\,y(n-i) + \sum_{i=1}^{10} \alpha'(i)\,\gamma^{i}\,x_z(n-i),$$

where:

$$y(n) = d(n) - \sum_{i=1}^{10} \alpha(i)\,d(n-i) + \sum_{i=1}^{10} \alpha(i)\,\gamma^{i}\,y(n-i).$$
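
The two recursions can be evaluated sample by sample as in the following sketch. It is a simplification under stated assumptions: all samples before the start of the subframe are taken as zero, whereas the response signal calculator 59 is described as keeping filter memory values; the function name `response_signal` and the argument names are hypothetical.

```python
def response_signal(d, alpha, alpha_q, gamma):
    """Evaluate y(n) and x_z(n) for one subframe by direct recursion.

    d:       input samples d(n)
    alpha:   linear prediction coefficients alpha(1..P)
    alpha_q: converted prediction coefficients alpha'(1..P)
    gamma:   perceptual weighting factor
    Samples before the subframe are treated as zero (assumption).
    """
    order = len(alpha)
    y, x_z = [], []

    def past(seq, n, i):
        # seq(n - i), taken as zero before the start of the subframe
        return seq[n - i] if n - i >= 0 else 0.0

    for n in range(len(d)):
        fir = sum(alpha[i - 1] * past(d, n, i) for i in range(1, order + 1))
        iir_y = sum(alpha[i - 1] * gamma ** i * past(y, n, i)
                    for i in range(1, order + 1))
        iir_x = sum(alpha_q[i - 1] * gamma ** i * past(x_z, n, i)
                    for i in range(1, order + 1))
        y.append(d[n] - fir + iir_y)
        x_z.append(d[n] - fir + iir_y + iir_x)
    return y, x_z
```

Written this way, it is easy to see that x_z(n) equals y(n) plus the third sum, so the second equation is the intermediate signal of the first.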



Connected to the perceptual weighting circuit 47 and to the response signal calculator 59, a speech
subframe subtracter 61 subtracts the response signal from the perceptually weighted signal to
produce a subframe difference signal according to:

$$x'_w(n) = x_w(n) - x_z(n).$$

Connected to the spectral parameter quantizer 43, an impulse response calculator 63 calculates, at a
predetermined number L of points, impulse responses h_w(n) of a weighted filter of the z-transform
which is represented as:

$$H_w(z) = \frac{1 - \sum_{i=1}^{10} \alpha(i)\,z^{-i}}{1 - \sum_{i=1}^{10} \alpha(i)\,\gamma^{i}\,z^{-i}} \cdot \frac{1}{1 - \sum_{i=1}^{10} \alpha'(i)\,\gamma^{i}\,z^{-i}}.$$



Controlled by the modes decided by the mode decision circuit 49 and by the impulse responses calculated by the 
impulse response calculator 63, an adaptive codebook circuit 65 is connected to the subframe subtracter 61 and to a 
pattern accumulating circuit 67. Depending on the modes, the adaptive codebook circuit 65 calculates pitch parameters 
and supplies the multiplexer 33 with a prediction difference signal defined by: 

$$z(n) = x'_w(n) - b(n),$$

where b(n) represents a pitch prediction signal given by:

$$b(n) = \beta\,v(n - T) * h_w(n),$$

where, in turn, β represents the gain of the adaptive codebook circuit 65, v(n) representing here an
adaptive code vector, and T representing a delay. The asterisk mark represents convolution.
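
The pitch prediction signal and the prediction difference signal can be sketched as follows. The function names, the truncation of the convolution to the subframe length, and the restriction to delays no shorter than the subframe are assumptions of this sketch and are not stated in the description.

```python
def convolve_truncated(x, h):
    """First len(x) samples of the convolution x * h."""
    return [sum(h[k] * x[n - k] for k in range(min(n, len(h) - 1) + 1))
            for n in range(len(x))]


def pitch_prediction(xw_prime, past_excitation, delay, beta, hw):
    """Return (b, z): pitch prediction signal and prediction difference.

    past_excitation holds earlier excitation samples, newest last. The delay
    must be at least the subframe length here (assumption of this sketch).
    """
    n_sub = len(xw_prime)
    if not n_sub <= delay <= len(past_excitation):
        raise ValueError("delay outside the range handled by this sketch")
    start = len(past_excitation) - delay
    delayed = past_excitation[start:start + n_sub]       # v(n - T)
    b = [beta * s for s in convolve_truncated(delayed, hw)]
    z = [x - bn for x, bn in zip(xw_prime, b)]           # z(n) = x'_w(n) - b(n)
    return b, z
```

A search over the delay T and the gain β would call such a routine for each candidate and keep the pair that leaves the smallest energy in z(n).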

Controlled by the modes decided by the mode decision circuit 49 and by the impulse responses
calculated by the impulse response calculator 63, an excitation quantizer 69 is supplied with the
prediction difference signal from the adaptive codebook circuit 65 and refers to a sparse excitation
codebook 71. Being of a non-regular pulse type, the sparse excitation codebook 71 keeps excitation
code vectors, each of which is composed of non-zero vector components of an individual non-zero
number or count. The excitation quantizer 69 produces, as optimal excitation code vectors c_j(n),
either a part or all of the excitation code vectors to minimize j-th differences defined by:

$$D(j) = \sum_{n} \left[ z(n) - \gamma(j)\,c_j(n) * h_w(n) \right]^{2}.$$
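
An exhaustive search over the excitation codebook that minimizes the j-th differences might be sketched as follows. The description does not say how the gain γ(j) is obtained for each candidate; the least-squares gain used here is an assumption, as are the function names and the truncation of the convolution to the subframe length.

```python
def filter_vector(c, hw):
    """First len(c) samples of the convolution of c with hw."""
    return [sum(hw[k] * c[n - k] for k in range(min(n, len(hw) - 1) + 1))
            for n in range(len(c))]


def search_excitation(z, codebook, hw):
    """Return (j, gain, distortion) of the best code vector, or None.

    The gain for each candidate is the least-squares gain (assumption).
    """
    best = None
    for j, c in enumerate(codebook):
        s = filter_vector(c, hw)
        energy = sum(v * v for v in s)
        if energy == 0.0:
            continue  # an all-zero filtered vector cannot reduce the error
        gain = sum(a * b for a, b in zip(z, s)) / energy
        dist = sum((a - gain * b) ** 2 for a, b in zip(z, s))
        if best is None or dist < best[2]:
            best = (j, gain, dist)
    return best
```

Keeping the few candidates with the smallest distortion rather than only the best one would correspond to producing "a part" of the code vectors for the later gain quantization.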

Controlled by the impulse responses calculated by the impulse response calculator 63 and supplied
with the prediction difference signal from the adaptive codebook circuit 65 and with the excitation
code vectors selected by the excitation quantizer 69, a gain quantizer 73 refers to a gain codebook
75 of gain code vectors. Reading the gain code vectors, the gain quantizer 73 selects combinations
of the excitation code vectors and the gain code vectors so as to minimize (j,k)-th differences
defined by:

$$D(j,k) = \sum_{n} \left[ x'_w(n) - \beta'(k)\,v(n-T) * h_w(n) - \gamma'(k)\,c_j(n) * h_w(n) \right]^{2},$$

where β′(k) and γ′(k) represent a k-th two-dimensional code vector of the gain code vectors.
Selecting the combinations, the gain quantizer 73 supplies the multiplexer 33 with the indexes
indicative of the excitation and the gain code vectors of such selected combinations.

In the Ozawa et al paper cited heretobefore, the excitation quantizer 69 selects at least two kinds,
such as for an unvoiced and a voiced mode, of optimal excitation code vectors. In the example being
illustrated, the gain quantizer 73 selects the optimal code vectors produced by the excitation
quantizer 69 under control by the modes. It is possible upon selection by the gain quantizer 73 to
specify the optimal excitation code vectors of a single kind. Alternatively, it is possible, on
applying the above-described equation for the j-th differences D(j) only to a part of the excitation
code vectors, to preliminarily select excitation code vector candidates and, by application of the
equation in question to the excitation code vector candidates, to select the optimal code vectors of
only one kind from the excitation code vector candidates.

Connected to the spectral parameter calculator and quantizer 41 and 43 and to the gain quantizer 73,
a weighting signal calculator 77 reads the excitation and the gain code vectors with reference to
their indexes and calculates a drive excitation signal v(n) according to:

$$v(n) = \beta'(k)\,v(n - T) + \gamma'(k)\,c_j(n).$$

Subsequently, the weighting signal calculator 77 calculates a weighting signal s_w(n) for delivery to
the response signal calculator 59 according to:

$$s_w(n) = v(n) - \sum_{i=1}^{10} \alpha(i)\,v(n-i) + \sum_{i=1}^{10} \alpha(i)\,\gamma^{i}\,p(n-i) + \sum_{i=1}^{10} \alpha'(i)\,\gamma^{i}\,s_w(n-i),$$

where:

$$p(n) = v(n) - \sum_{i=1}^{10} \alpha(i)\,v(n-i) + \sum_{i=1}^{10} \alpha(i)\,\gamma^{i}\,p(n-i).$$

It is now understood in connection with the example being illustrated that the modes are decided
either for each original speech frame or for each weighted speech frame by the feature quantities
extracted from the input speech signal for a longer period which is longer than one frame period.
Even if the frame period is only 5 ms long or shorter and the feature quantities may be erroneous
when extracted from the current speech frame alone, the previous speech frame would give correct and
precise feature quantities when the previous speech frame is at least one frame period prior to the
current speech frame. As a consequence, it is possible to prevent unstable and erroneous
interswitching of the modes and thereby to prevent the code quality from deteriorating.

Referring to Fig. 3 with Figs. 1 and 2 continuously referred to, another mode decision circuit is
for use in a speech signal encoder device according to a second preferred embodiment of this
invention. Throughout the following, similar parts are designated by like reference numerals and are
similarly operable with likewise named signals unless specifically otherwise mentioned. This mode
decision circuit is therefore designated by the reference numeral 49. Except for the mode decision
circuit 49 which will be described in the following, the speech signal encoder device is not
different from that illustrated with reference to Fig. 1.

In the mode decision circuit 49 being illustrated, the frame delay 53 is connected directly to the
mode decision circuit input terminal 49(I). Supplied from the perceptual weighting circuit 47 with
the perceptually weighted signal through the mode decision circuit input terminal 49(I), the frame
delay 53 produces a delayed weighted signal with a one-frame delay.

Connected to the frame delay 53 and to the mode decision circuit input terminal 49(I), the feature
quantity calculator 51 calculates a pitch prediction gain G for each speech frame as the feature
quantities. The pitch prediction gain is calculated according to:

G= 10logio(P/E). 

where: 

N-1 

P= £ x[wl^n) 

n— N+1 

and 

N-1 N-1 

E = P-[ £ x[w](n)x[w](n-T)]'-[ £ x[w]^(n-T)L 

n— N+1 n— N+1 



where, in turn, T represents here an optimal delay that maximizes such prediction delays. N representing a total number 
of speech sanples in each frame. 
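The gain calculation above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the `origin` argument (the index of sample n = 0 of the current frame inside a longer buffer), and the `floor` guard against a zero error power are not part of the specification.

```python
import math

def error_power(signal, origin, N, T):
    """Return (P, E) for lag T, summing n from -N+1 to N-1 around `origin`.

    `origin` is the buffer index of sample n = 0 of the current frame
    (an assumed convention, not taken from the specification).
    """
    P = corr = energy = 0.0
    for n in range(-N + 1, N):
        cur = signal[origin + n]
        lagged = signal[origin + n - T]
        P += cur * cur
        corr += cur * lagged
        energy += lagged * lagged
    if energy == 0.0:
        return P, P  # silent history: prediction removes nothing
    return P, P - corr * corr / energy

def pitch_prediction_gain_db(signal, origin, N, T, floor=1e-12):
    """G = 10*log10(P/E); `floor` is an assumed guard against E == 0."""
    P, E = error_power(signal, origin, N, T)
    return 10.0 * math.log10(max(P, floor) / max(E, floor))
```

With a 5 ms frame sampled at 8 kHz (an assumed rate), N would be 40, so each sum spans 79 samples taken from the previous and the current frame together.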

Connected to the feature quantity calculator 51, the mode decision unit 57 compares the pitch prediction gain with
predetermined threshold values to decide modes of the input speech signal from frame to frame. The modes are delivered
as decided mode results through the mode decision circuit output terminal 49(0) to the multiplexer 33, the adaptive
codebook circuit 65, and the excitation quantizer 69.
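The threshold comparison carried out by the mode decision unit 57 might look like the following sketch. The threshold values and the numbering of the modes are purely illustrative assumptions; the specification only states that the thresholds are predetermined.

```python
from bisect import bisect_right

def decide_mode(gain_db, thresholds=(1.0, 4.0, 7.0)):
    """Map a pitch prediction gain (dB) to a mode index.

    The three thresholds are hypothetical and yield four modes, 0 to 3.
    A gain equal to a threshold falls into the higher mode.
    """
    return bisect_right(sorted(thresholds), gain_db)
```

Because the gain fed to this comparison already covers the previous and the current frame, a single noisy frame is less likely to flip the decided mode.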

In the speech signal encoder device including the mode decision circuit 49 being illustrated, mode information is 
produced as an average for more than one frame period. This makes it possible to suppress deterioration which would 
otherwise take place in the code quality. 

Further turning to Fig. 4 with Figs. 1 and 2 continuously referred to, a pitch extracting circuit is for use in a speech
signal encoder device according to a third preferred embodiment of this invention. The pitch extracting circuit is used in
place of the mode deciding circuit 49 and is therefore designated by a similar reference symbol 49(A). In other respects,
the speech signal encoder device is not much different from that illustrated with reference to Fig. 1 except for the adaptive
codebook circuit 65 which is now operable as will shortly be described.

In Fig. 4, pitch extracting circuit input and output terminals correspond to the mode decision circuit input and output
terminals 49(1) and 49(0) described in conjunction with Fig. 2 and are consequently designated by the reference symbols
49(1) and 49(0). The pitch extracting circuit 49(A) comprises the frame delay 53 connected directly to the pitch
extracting circuit input terminal 49(1) as in the mode decision circuit 49 described with reference to Fig. 3.

Connected to the frame delay 53 and to the pitch extracting circuit input terminal 49(1) is a pitch calculator 79. Supplied
from the perceptual weighting circuit 47 through the pitch extracting circuit input terminal 49(1) with the perceptually
weighted signal as an undelayed weighted signal and from the frame delay 53 with the delayed weighted signal, the
pitch calculator 79 calculates pitches T (the same reference symbol being used) which maximize a novel error power
E(T) defined by:

$$E(T) = \sum_{n=-N+1}^{N-1} x_w^2(n) - \frac{\left[\sum_{n=-N+1}^{N-1} x_w(n)\,x_w(n-T)\right]^2}{\sum_{n=-N+1}^{N-1} x_w^2(n-T)}.$$
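An open-loop search over candidate lags using this two-frame error power could be sketched as below. Several points are assumptions rather than statements of the specification: the search keeps the lag with the smallest E(T), which is the lag giving the largest pitch prediction gain G; the candidate range of 20 to 80 samples is arbitrary; and `origin` marks sample n = 0 of the current frame in a longer buffer.

```python
def error_power(signal, origin, N, T):
    """E(T) summed over n = -N+1 .. N-1 (previous and current frame)."""
    P = corr = energy = 0.0
    for n in range(-N + 1, N):
        cur = signal[origin + n]
        lagged = signal[origin + n - T]
        P += cur * cur
        corr += cur * lagged
        energy += lagged * lagged
    if energy == 0.0:
        return P  # nothing to predict from
    return P - corr * corr / energy

def extract_pitch(signal, origin, N, lags=range(20, 81)):
    """Return the candidate lag with the smallest E(T).

    Assumed convention: smallest residual error power, equivalently the
    largest prediction gain. The lag range is hypothetical.
    """
    return min(lags, key=lambda T: error_power(signal, origin, N, T))
```

The buffer must hold at least N - 1 + max(lags) samples before `origin`, since the lagged term reaches that far back.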



Extracting the pitches T from the input speech signal in this manner, the pitch extracting circuit 49(A) delivers the
pitches to the adaptive codebook circuit 65. Although connections are depicted in Fig. 1 between the mode deciding
circuit 49 and the multiplexer 33 and between the mode deciding circuit 49 and the excitation quantizer 69, it is unnecessary
for the pitch extracting circuit 49(A) to deliver the pitches to the multiplexer 33 and to the excitation quantizer 69.

Supplied from the pitch extracting circuit 49(A) with the pitches, the adaptive codebook circuit 65 closed-loop
searches for lag parameters near the pitches in the subframes of the subframe difference signal. Furthermore, the
adaptive codebook circuit 65 carries out pitch prediction to produce the prediction difference signal z(n) described
before.

It has been confirmed that the pitch extracting circuit 49(A) is excellently operable. In the Ozawa et al paper cited
before, the pitches T are calculated so as to minimize a conventional error power defined by:

$$E(T) = \sum_{n=0}^{N-1} x_w^2(n) - \frac{\left[\sum_{n=0}^{N-1} x_w(n)\,x_w(n-T)\right]^2}{\sum_{n=0}^{N-1} x_w^2(n-T)}.$$

In contrast, the pitch extracting circuit 49(A) calculates for each original or weighted speech frame an averaged
pitch over two or more frame periods. This avoids extraction of unstable and erroneous pitches and prevents the code
quality from being inadvertently deteriorated.

Referring afresh to Fig. 5, a speech signal encoder device is similar, according to a fourth preferred embodiment of
this invention, to that illustrated with reference to Figs. 1 and 4.

Between the perceptual weighting unit 47 and the mode decision unit 57 which is described in connection with Fig.
3, use is made of a pitch and pitch prediction gain (T & G) extracting circuit 49(B) connected to the adaptive codebook
circuit 65. Instead of the sparse excitation codebook 71, first through N-th sparse excitation codebooks 71(1) through
71(N) are connected to the excitation quantizer 69.

It is possible to understand that Fig. 4 shows also the pitch and pitch prediction gain extracting circuit 49(B). A pitch
and pitch prediction gain extracting circuit input terminal is connected to the perceptual weighting circuit 47 to correspond
to the mode decision or the pitch extracting circuit input terminal and is designated by the reference symbol 49(1).
A pitch and pitch prediction gain calculator 79(A) is connected to the frame delay 53 like the pitch calculator 79 and
calculates the pitches T to maximize the novel error power defined before and the pitch prediction gain G by using the
equation which is given before and in which E is clearly equal to the novel error power. In the manner understood from
Fig. 5, the pitch and pitch prediction gain extracting circuit 49(B) has two pitch and pitch prediction gain extracting circuit
output terminals connected to the pitch and pitch prediction gain calculator 79(A) instead of only one pitch extracting
circuit output terminal 49(0).

One of these two output terminals is for the pitches T and is connected to the adaptive codebook circuit 65. The
other is for the pitch prediction gain G and is connected to the mode decision circuit 49, which uses such pitch prediction
gains as the feature quantities.

The adaptive codebook circuit 65 is controlled by the modes and is operable to closed-loop search for the lag
parameters in the manner described above. The excitation quantizer 69 uses either a part or all of the excitation code
vectors stored in the first through the N-th excitation codebooks 71(1) to 71(N).

Referring now to Fig. 6, the description will proceed to a speech signal encoder device according to a fifth preferred
embodiment of this invention. This speech signal encoder device is similar to that illustrated with reference to Fig. 1
except for the following. That is, the mode decision circuit 49 is supplied from the spectral parameter calculator 41 with
the spectral parameters a (I, p) for the first and the second subframes besides being supplied from the perceptual weighting
circuit 47 with the weighted speech subframes x[w](n) at the frame period.

Turning to Fig. 7 with Fig. 6 continuously referred to, the mode decision circuit 49 has first and second circuit input
terminals 49(1) and 49(2) connected to the perceptual weighting circuit 47 and to the spectral parameter calculator 41,
respectively. Corresponding to the mode decision circuit output terminal described in connection with Fig. 2, a sole circuit
output terminal is designated by the reference symbol 49(0) and connected to the multiplexer 33 and to the adaptive
codebook circuit 65 and the excitation quantizer 69.

Connected to the first circuit input terminal 49(1), a first feature quantity calculator 81 calculates primary feature
quantities, such as the pitch prediction gains which are described before and will hereafter be indicated by PG. Connected
to the first and the second circuit input terminals 49(1) and 49(2), a second feature quantity calculator 83 calculates
secondary feature quantities which may be short-period or short-term predicted gains SG.

Supplied with the primary and the secondary feature quantities and with delayed mode information through a frame
delay 85, a mode decision unit 87 selects one of the modes for each current frame as output mode information like the
mode decision unit 57 described in conjunction with Fig. 2 by comparing a combination of the primary and the secondary
feature quantities and the delayed mode information with the predetermined threshold values of the type described
before. The output mode information is delivered to the sole circuit output terminal 49(0) and to the frame delay 85,
which gives a delay of one frame period to supply the delayed mode information back to the mode decision unit 87. It
is preferred that the combination of the delayed mode information and the primary and the secondary feature quantities
should be a weighted combination of the type of the weighted sum Gav described in connection with Fig. 2.
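One way to picture the feedback of delayed mode information is the sketch below, in which the decided mode of one frame biases the decision for the next. The weights, the thresholds, the initial mode, and the class name are all hypothetical; the specification states only that a weighted combination is preferred.

```python
from bisect import bisect_right

class ModeDecider:
    """Threshold a weighted sum of PG, SG and the previous mode.

    All numeric values are illustrative assumptions, not values taken
    from the specification.
    """

    def __init__(self, thresholds=(2.0, 5.0, 8.0),
                 w_pg=0.6, w_sg=0.3, w_mode=0.5):
        self.thresholds = sorted(thresholds)
        self.w_pg, self.w_sg, self.w_mode = w_pg, w_sg, w_mode
        self.prev_mode = 0  # plays the role of the one-frame delay

    def decide(self, pg, sg):
        score = (self.w_pg * pg + self.w_sg * sg
                 + self.w_mode * self.prev_mode)
        mode = bisect_right(self.thresholds, score)
        self.prev_mode = mode  # fed back for the next frame
        return mode
```

In this sketch the same pair of gains can yield different modes depending on the previous frame, which damps rapid switching between modes.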

In other respects, operation of this speech signal encoder device is not different from that described in conjunction
with Fig. 1. It is possible with the mode decision circuit 49 described with reference to Fig. 7 to achieve the above-pointed
out technical merits.

Referring to Fig. 8, another mode decision circuit is for use in the speech signal encoder device described in the
foregoing and is designated again by the reference numeral 49.

As illustrated with reference to Fig. 7, this mode decision circuit 49 has the first and the second circuit input terminals
49(1) and 49(2) and the sole circuit output terminal 49(0) and comprises the first and the second feature quantity
calculators 81 and 83, the frame delay 85, and the mode decision unit 87. Operable in the manner described in conjunction
with Fig. 7, the first feature quantity calculator 81 delivers the pitch prediction gains PG to the mode decision
unit 87. In the example being illustrated, the second feature quantity calculator 83 is supplied only with the weighted
speech subframes and calculates, for supply to the mode decision unit 87, RMS ratios RR as the secondary feature
quantities in the manner which will presently be described. Connected to the first and the second circuit input terminals
49(1) and 49(2) and being operable as will shortly be described, a third feature quantity calculator 89 calculates, for
delivery to the mode decision unit 87, the short-period predicted gains SG and short-period predicted gain ratios SGR
collectively as ternary feature quantities. The frame delay 85 and the mode decision unit 87 are operable in the manner
described above.

Turning to Fig. 9 and Figs. 6 and 8 again referred to, the second feature quantity calculator 83 comprises an RMS
calculator 91 supplied with the weighted speech subframes frame by frame through the first circuit input terminal 49(1)
to calculate RMS values R which are used in the Ozawa et al paper. Connected to the RMS calculator 91, a frame delay
(D) 93 gives a delay of one frame period to the RMS values to produce delayed values. Supplied with the RMS values
and the delayed values, an RMS ratio calculator 95 calculates the RMS ratios for delivery to the mode decision unit 87.
Each RMS ratio is a rate of variation of the RMS values with respect to a time axis scaled by the frame period.
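The chain of RMS calculator, frame delay, and RMS ratio calculator can be illustrated as follows. The sketch assumes that the ratio is the current RMS value divided by the delayed one and that a ratio of 1.0 is reported until the delay line is filled or when the delayed value is zero; neither convention is stated in the specification.

```python
import math
from collections import deque

def rms(frame):
    """Root mean square of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

class RmsRatioCalculator:
    """RMS value plus its ratio to the value `delay_frames` frames earlier."""

    def __init__(self, delay_frames=1):
        # The deque acts as the frame delay; its oldest entry is the
        # delayed RMS value once the deque is full.
        self._history = deque(maxlen=delay_frames)

    def update(self, frame):
        current = rms(frame)
        ratio = 1.0  # assumed neutral value when no valid delayed value exists
        full = len(self._history) == self._history.maxlen
        if full and self._history[0] > 0.0:
            ratio = current / self._history[0]
        self._history.append(current)
        return current, ratio
```

Calling `update` once per frame returns the pair (R, RR) that would be handed to the mode decision unit in this sketch.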

Further turning to Fig. 10 with Figs. 6 and 8 continuously referred to, the third feature quantity calculator 89 comprises
a short-period predicted gain (SG) calculator 97 connected to the first and the second circuit input terminals
49(1) and 49(2) to calculate the short-period predicted gains for supply to the mode decision unit 87. Although separated
from the frame delay described in conjunction with Fig. 9, a frame delay (D) is indicated by the reference numeral
93 merely for convenience of illustration and is similarly operable to produce delayed prediction gains which are related
to the previous frame described before. Responsive to the short-period prediction gains and to the delayed prediction
gains, a short-period prediction gain ratio (SGR) calculator 99 calculates the short-period predicted gain ratios for delivery
to the mode decision unit 87.

Still further turning to Fig. 11 with Figs. 6 and 8 continuously referred to, the third feature quantity calculator 89 comprises
first and second frame delays 93(1) and 93(2) in place of the frame delay 93 depicted in Fig. 9. As a consequence,
the third feature quantity calculator 89 supplies the mode decision unit 87 with the short-period predicted gains
which are calculated by comparing the predetermined threshold values with a sum, preferably a weighted sum, calculated
in each frame by a short-period predicted gain and a delayed predicted gain delivered from the first and the second
frame delays 93(1) and 93(2) with a total delay of two frame periods given to the short-period predicted gain.

Referring to Fig. 12 with Fig. 6 continuously referred to, the mode decision circuit 49 is similar partly to that
described in connection with Fig. 8 and partly to that of Fig. 9. More particularly, the second feature quantity calculator
83 supplies the mode decision unit 87 with the RMS values R in addition to the RMS ratios RR. The first and the third
feature quantity calculators 81 and 89, the frame delay 85, and the mode decision unit 87 are operable in the manner
described before.

Turning to Fig. 13 with Fig. 12 continuously referred to, the second feature quantity calculator 83 is similar to that
illustrated with reference to Fig. 9. The RMS calculator 91 delivers, however, the RMS values directly to the mode decision
unit 87. In addition, the RMS calculator 91 delivers the RMS values to the RMS ratio calculator 95 directly and
through a series connection of first and second frame delays (D) which are separate from those described in connection
with Fig. 11 and nevertheless are designated by the reference numerals 93(1) and 93(2). It is now understood that the
RMS ratio calculator 95 calculates the RMS ratio of each current RMS value to a previous RMS value which is two
frame periods prior to the current RMS value.

Further turning to Fig. 14 with Figs. 6 and 12 again referred to, the second feature quantity calculator 83 is similar to
that described with reference to Fig. 9. The RMS calculator 91 delivers, however, the RMS values directly to the mode
decision unit 87 besides to the frame delay 93 and to the RMS ratio calculator 95.

Referring to Fig. 15 with Fig. 6 continuously referred to, the mode decision circuit 49 is similar to that described with
reference to Fig. 12. The second feature quantity calculator 83 delivers, however, only the RMS values R to the mode
decision unit 87.

Referring now to Fig. 16, attention will be directed to a speech signal encoder device according to a sixth preferred
embodiment of this invention. In this speech signal encoder device, the mode decision circuit 49 is supplied only from
the perceptual weighting circuit 47 with the weighted speech subframes at the frame period, calculates the pitch prediction
gains as the feature quantities like the first feature quantity calculator 81 described in conjunction with Fig. 7, 8,
12, or 15, and decides the mode information of each original speech frame for delivery to the multiplexer 33, the adaptive
codebook circuit 65, and the excitation quantizer 69. In the example being illustrated, the mode information is additionally
used in the manner which will be described in the following.

Connected to the perceptual weighting circuit 47, supplied from the mode decision circuit 49 with the mode information
at the frame period, and accompanied by a partial feedback loop 101, a pitch extracting circuit 103 calculates
corrected pitches CPP in each frame period for supply to the adaptive codebook circuit 65 as follows.

Turning to Fig. 17 with Fig. 16 continuously referred to, the pitch extracting circuit 103 has a first extracting circuit
input terminal 103(1) connected to the mode decision circuit 49, a second extracting circuit input terminal 103(2) connected
to the perceptual weighting circuit 47, and a third extracting circuit input terminal 103(3) connected to the partial
feedback loop 101. An extracting circuit output terminal 103(0) is connected to the adaptive codebook circuit 65.

In the manner which will presently be described, the partial feedback loop 101 feeds a current pitch CP of each current
frame to the third extracting circuit input terminal 103(3). An additional feature quantity calculator 105 calculates
such current pitches, previous pitches PP, and pitch ratios DR in response to the current pitches and to the weighted
speech subframes supplied thereto at the frame period. The previous pitches have a common delay of one frame period
relative to the current pitches. Each pitch ratio represents a rate of variation in the current pitches in each frame period.

Connected to the first extracting circuit input terminal 103(1), a frame delay (D) 107 gives a delay of one frame
period to produce delayed information. Supplied from the first extracting circuit input terminal 103(1) with the mode
information, from the frame delay 107 with the delayed information, and from the additional feature quantity calculator
105 with the current pitches, the previous pitches, and the pitch ratios collectively as feature quantities, a feature quantity
adjusting unit 109 compares the pitch ratios with a predetermined additional threshold value with reference to the
mode and the delayed information to adjust or correct the current pitches by the previous pitches and the pitch ratios
into adjusted pitches CPP for delivery to the extracting circuit output terminal 103(0).
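The specification does not spell out the correction rule applied by the feature quantity adjusting unit 109. The sketch below shows one plausible reading, offered only as an assumption: when the current and the delayed mode agree and indicate a non-zero (here taken to mean voiced) mode, a pitch ratio beyond a limit is treated as an extraction error and the previous pitch is used instead. The function name, the limit of 1.5, and the meaning given to mode 0 are all hypothetical.

```python
def adjust_pitch(current, previous, mode, prev_mode, ratio_limit=1.5):
    """Return an adjusted pitch CPP from CP, PP and the two modes.

    Hypothetical rule: hold the previous pitch when the mode is stable
    and non-zero but the pitch ratio CP/PP leaves [1/limit, limit].
    """
    if previous <= 0:
        return current  # no usable previous pitch
    ratio = current / previous
    stable = mode == prev_mode and mode != 0
    jumped = ratio > ratio_limit or ratio < 1.0 / ratio_limit
    return previous if stable and jumped else current
```

Under this rule a sudden doubling of the pitch inside a steady segment is rejected, while the same jump at a mode change is accepted as genuine.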

Further turning to Fig. 18 with Figs. 16 and 17 continuously referred to, the additional feature quantity calculator 105
comprises a pitch calculator 111 connected to the second extracting circuit input terminal 103(2) to receive the perceptually
weighted speech subframes at the frame period and to calculate the current pitches CP for delivery to the partial feedback
loop 101 and to the feature quantity adjusting unit 109. Supplied with the current pitches through the third
extracting circuit input terminal 103(3), a frame delay (D) 113 produces the previous pitches PP for supply to the feature
quantity adjusting unit 109. Supplied with the current and the previous pitches, a pitch ratio calculator 115 calculates
the pitch ratios DR for supply to the feature quantity adjusting unit 109.

In Fig. 16, the adaptive codebook circuit 65 is operable similar to that described in conjunction with the speech signal
encoder device comprising the pitch calculator 79 illustrated with reference to Fig. 4. More specifically, the adaptive
codebook circuit 65 closed-loop searches for the pitches in each previous subframe of the subframe difference signal
near the adjusted pitches CPP rather than the lag parameters near the pitches calculated by the pitch calculator 79.
In other respects, the speech signal encoder device of Fig. 15 is similar to that illustrated with reference to Fig. 6.

Referring to Fig. 19 with Fig. 15 additionally referred to, another pitch extracting circuit is for use in the speech signal
encoder device under consideration. This pitch extracting circuit corresponds to that illustrated with reference to Fig.
17 and will be designated by the reference numeral 103.

The pitch extracting circuit 103 has only the first and the second extracting circuit input terminals 103(1) and 103(2)
and the extracting circuit output terminal 103(0). In other words, the pitch extracting circuit 103 is not accompanied by
the partial feedback loop 101 described in connection with Fig. 16.

Supplied from the perceptual weighting circuit 47 with the weighted speech subframes frame by frame, the additional
feature quantity calculator 105 calculates the current pitches CP as the feature quantities. Responsive to the
mode information supplied from the mode decision circuit 49 frame by frame and to the delayed information produced
by the frame delay 107, the feature quantity adjusting unit 109 adjusts the current pitches into the adjusted pitches CPP
for use in the adaptive codebook circuit 65.

Referring to Fig. 20 with Figs. 16 and 17 additionally referred to, another additional feature quantity calculator is for
use in the pitch extracting circuit 103 accompanied by the partial feedback loop 101 and is designated by the reference
numeral 105. This additional feature quantity calculator 105 is similar to that illustrated with reference to Fig. 18. In the
additional feature quantity calculator 105 being illustrated, the frame delay 113 of Fig. 18 is afresh referred to as a first
frame delay 113(1) and delivers the previous pitches PP to the feature quantity adjusting unit 109.

Supplied through the second extracting circuit input terminal 103(2) with the perceptually weighted speech subframes
at the frame period, the pitch calculator 111 calculates the current pitches CP for supply to the feature quantity
adjusting unit 109 and to the partial feedback loop 101 and thence to the third extracting circuit input terminal 103(3)
depicted in Fig. 18. Connected in series to the first frame delay 113(1), a second delay 113(2) gives a delay of one
frame period to the previous pitches to produce past previous pitches PPP which have a long delay of two frame periods
relative to the current pitches. So as to deliver the pitch ratios DR to the feature quantity adjusting unit 109, the pitch
ratio calculator 115 is operable identically with that described in connection with Fig. 18.

Referring to Fig. 21 with Fig. 16 continuously referred to, the pitch extracting circuit 103 is for use in combination
with the partial feedback loop 101. Supplied with the mode information frame by frame through the first extracting circuit
input terminal 103(1), with the perceptually weighted speech subframes frame by frame through the second extracting
circuit input terminal 103(2), and with the current pitches CP through the third extracting circuit input terminal 103(3),
this pitch extracting circuit 103 delivers the adjusted pitches CPP to the adaptive codebook circuit 65 through the
extracting circuit output terminal 103(0).

Connected to the second and the third extracting circuit input terminals 103(2) and 103(3), an additional feature
quantity calculator is similar to that described with reference to any one of Figs. 17 through 20 and is consequently designated
again by the reference numeral 105. Responsive to the perceptually weighted speech subframes of each frame
and to the current pitches, this additional feature quantity calculator 105 calculates the pitch ratios DR for delivery
together with the current pitches to the feature quantity adjusting unit 109 collectively as the feature quantities. Responsive
to the mode and the delayed information, the feature quantity adjusting unit 109 compares the pitch ratios with the
additional threshold value to adjust the current pitches now only by the pitch ratios into the adjusted pitches.

Turning to Fig. 22 with Figs. 16 and 21 continuously referred to, the additional feature quantity calculator 105 is similar
to that illustrated with reference to Figs. 18 or 20. The previous pitches are, however, not supplied to the feature
quantity adjusting unit 109.

Referring again to Fig. 22 with Figs. 16 and 21 additionally referred to, the additional feature calculator 105 may
comprise, instead of the first and the second frame delays 113(1) and 113(2), singly the frame delay 113 between the
third extracting circuit input terminal 103(3) and the pitch ratio calculator 115 as in Fig. 18 and without supply of the previous
pitches to the feature quantity adjusting unit 109.

Referring anew to Fig. 23 with Fig. 16 continuously referred to, the pitch extracting circuit 103 is not different from
that of Fig. 21 insofar as depicted in blocks. The additional feature quantity calculator 105 is, however, a little different
from that described in conjunction with Fig. 21. Accordingly the feature quantity adjusting unit 109 is somewhat differently
operable.

Turning to Fig. 24 with Figs. 16 and 23 continuously referred to, the additional feature quantity calculator 105 comprises
the pitch calculator 111 supplied through the second extracting circuit input terminal 103(2) with the perceptually
weighted speech subframes at the frame period to deliver the current pitches CP to the partial feedback loop 101 and
to the feature quantity adjusting unit 109. The frame delay 113 is supplied with the current pitches CP through the third
extracting circuit input terminal 103(3) to supply the previous pitches PP to the feature quantity adjusting unit 109.

Turning back to Fig. 23, the feature quantity adjusting unit 109 is operable as follows. In response to the mode and
the delayed information supplied through the first extracting circuit input terminal 103(1) directly and additionally
through the frame delay 107, the feature quantity adjusting unit 109 compares the previous pitches with predetermined
further additional threshold values to adjust the current pitches by the previous pitches into the adjusted pitches CPP.

Referring afresh to Fig. 25, the description will proceed to a speech signal encoder device according to a seventh
preferred embodiment of this invention. This speech signal encoder device is different as follows from that illustrated
with reference to Fig. 5.

In the manner described referring to Figs. 6 and 7, 8, 12, or 15, the mode decision circuit 49 calculates the pitch
prediction gains at the frame period and decides the mode information. In the manner described in the Ozawa et al
paper, an RMS extracting circuit 121 is connected to the frame dividing circuit 37 and is accompanied by an RMS codebook
123 keeping a plurality of RMS code vectors. Controlled by the mode information specifying one of the predetermined
modes for each of the original speech frames into which the input speech signal is segmented, the RMS
extracting circuit 121 selects one of the RMS code vectors as a selected RMS vector for delivery to the multiplexer 33
and therefrom to the device output terminal 35. The RMS extracting circuit 121 serves as a level extracting arrangement.

Turning to Fig. 26 with Fig. 25 continuously referred to, the RMS extracting circuit 121 has a first extracting circuit
input terminal 121(1) supplied from the mode decision circuit 49 with the mode information as current mode information
at the frame period. Connected to the frame dividing circuit 37, a second extracting circuit input terminal 121(2) is supplied
with the original speech frames. A third extracting circuit input terminal 121(3) is for referring to the RMS codebook 123. An
extracting circuit output terminal 121(0) is for delivering the selected RMS vector to the multiplexer 33.

Connected to the second extracting circuit input terminal 121(2), an RMS calculator 125 calculates the RMS values
R like the RMS calculator 91 described in conjunction with Fig. 9, 13, or 14. Responsive to the current mode information
and to previous mode information supplied from the first extracting circuit input terminal 121(1) directly and through a
frame delay (D) 127, an RMS adjusting unit 129 compares the RMS values fed from the RMS calculator 125 as original
RMS values with a predetermined still further additional threshold value to adjust the original RMS values into adjusted
RMS values IR. Connected to the RMS adjusting unit 129 and to the third extracting circuit input terminal 121(3), an
RMS quantization vector selector 131 selects one of the RMS code vectors that is most similar to the adjusted RMS
values at each frame period as the selected RMS vector for delivery to the extracting circuit output terminal 121(0).
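The selection made by the RMS quantization vector selector 131 can be pictured as a nearest-neighbour search. The sketch assumes that "most similar" means the smallest squared Euclidean distance, which the specification does not state, and the function name and the codebook contents used with it are made up for illustration.

```python
def select_rms_code_vector(adjusted, codebook):
    """Return (index, code_vector) of the entry closest to `adjusted`.

    Similarity is assumed to be squared Euclidean distance; ties go to
    the entry with the lowest index.
    """
    def distance(code_vector):
        return sum((a - c) ** 2 for a, c in zip(adjusted, code_vector))

    index = min(range(len(codebook)), key=lambda i: distance(codebook[i]))
    return index, codebook[index]
```

In an encoder of this kind only the index would normally need to be multiplexed into the output, since the decoder can hold the same codebook.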

Further turning to Fig. 27 with Fig. 25 continuously referred to, the RMS extracting circuit 121 additionally comprises
an additional frame delay 133 supplied from the RMS adjusting unit 129 with the adjusted RMS values as current
adjusted values to supply previous adjusted values back to the RMS adjusting unit 129. Responsive to the current and
the previous mode information and to the previous adjusted values, the RMS adjusting unit 129 adjusts the original
RMS values into the adjusted RMS values.

Still further turning to Fig. 28 with Fig. 25 continuously referred to, the RMS extracting circuit 121 is different from
that illustrated with reference to Fig. 27 in that the previous adjusted values are not fed back to the RMS adjusting unit
129. Instead, the additional frame delay 133 delivers the previous adjusted values to an RMS ratio calculator 135 which
is supplied from the RMS calculator 125 with the original RMS values to calculate RMS ratios RR for feed back to the
RMS adjusting unit 129. In connection with the RMS ratios, it should be noted that the previous adjusted values are produced
by the additional frame delay 133 concurrently with previous RMS values which are the original RMS values
delivered one frame period earlier from the RMS calculator 125 to the RMS adjusting unit 129 than the previous
adjusted values under consideration. Each RMS ratio is a ratio of each original RMS value to one of the previous
adjusted values that is produced by the additional frame delay 133 concurrently with the previous RMS value one frame
period earlier than the above-mentioned each original RMS value.

The RMS adjusting unit 129 is now operable like the feature quantity adjusting unit 109 described by again referring
to Fig. 22. More in detail, the RMS adjusting unit 129 produces the RMS adjusted values IR by comparing the original
RMS values R with the still further additional threshold value in response to the current and the previous mode information
and the RMS ratios.

Referring to Fig. 29 with Fig. 25 continuously referred to, the RMS extracting circuit 121 comprises the RMS adjusting
unit 129 which is additionally supplied from the additional frame delay 133 with the previous adjusted values besides
the original RMS values and the RMS ratios. The RMS adjusting unit 129 is consequently operable like the feature
quantity adjusting unit 109 described in conjunction with Figs. 17 and 18. More particularly, the RMS adjusting unit 129
produces the RMS adjusted values IR by comparing the original RMS values with the still further additional threshold
value to adjust the current RMS values by the previous adjusted values in response to the current and the previous
mode information and the RMS ratios.

Turning to Fig. 30 with Fig. 25 continuously referred to, the RMS extracting circuit 121 is different from that illustrated
with reference to Fig. 28 in that the additional frame delay 133 of Fig. 28 is changed to a series connection of first
and second frame delays 133(1) and 133(2). The RMS ratio calculator 135 calculates RMS ratios of the current RMS
values to past previous RMS adjusted values produced by the RMS adjusting unit 129 in response to RMS values which
are two frame periods prior to the current RMS values. The RMS adjusting unit 129 is operable in the manner described
as regards the RMS extracting circuit 121 illustrated with reference to Fig. 28. It should be noted in this connection that
the RMS ratios are different between the RMS adjusting units described in conjunction with Figs. 28 and 30.

Referring once more to Figs. 29 and 30 with Fig. 25 continuously referred to, the RMS extracting circuit 121 may 
comprise the first and the second additional frame delays 133(1) and 133(2) and a signal line between the first addi- 
tional frame delay 133(1) and the RMS adjusting unit 129 in the manner depicted in Fig. 29. The RMS ratio calculator 
135 is operable as described in connection with Fig. 30. The RMS adjusting unit 129 is operable as described in con- 
junction with Fig. 29. 

Claims 

1. A speech signal encoder device comprising segmenting means (31) for segmenting an input speech signal into 
original speech frames at a predetermined frame period, deciding means (49) for using said original speech frames 
in deciding a predetermined number of modes of said original speech frames to produce decided mode results, and 
encoding means (65, 69, 73, 33) for encoding said input speech signal into codes at said frame period and in 
response to said modes to produce said decided mode results and said codes as an encoder device output signal, 
characterised in that said deciding means decides said modes by using feature quantities of each current speech 
frame segmented from said input speech signal at said frame period and a previous speech frame segmented at 
least one frame period prior to said current speech frame. 

2. A speech signal encoder device as claimed in claim 1, characterised in that said deciding means (49) comprises: 

calculating means (51, 53) for calculating a weighted sum of each current and a previous quantity extracted 
from said current and said previous speech frames as said feature quantities; and 

mode deciding means (57) for using said weighted sum in deciding said modes. 

3. A speech signal encoder device as claimed in claim 1, further comprising: 

extracting means (49(B)) for using said current and said previous speech frames in extracting pitches from 
said input speech signal; 

wherein said deciding means (49) decides said modes by using said pitches as said feature quantities. 

4. A speech signal encoder device as claimed in any one of claims 1 to 3, characterised in that each of said feature 
quantities is a pitch prediction gain of said current speech frame. 

5. A speech signal encoder device comprising segmenting means (31) for segmenting an input speech signal into 
original speech frames at a predetermined frame period, extracting means (49(A)) for using said original speech 
frames in extracting pitches from said input speech signal, and encoding means (65, 69, 73, 33) for encoding said 
input speech signal at said frame period and in response to said pitches into codes for use as an encoder device 
output signal, characterised in that said extracting means extracts said pitches by using each current speech frame 
segmented from said input speech signal at said frame period and a previous speech frame segmented at least 
one frame period prior to said current speech frame. 

6. A speech signal encoder device comprising segmenting means (31) for segmenting an input speech signal into 
original speech frames at a predetermined frame period, deciding means (49) for using said original speech frames 
in deciding a predetermined number of modes of said original speech frames to produce decided mode results, and 
encoding means (65, 69, 73, 33) for encoding said input speech signal into codes at said frame period and in 
response to said codes to produce said decided mode results and said codes as an encoder device output signal, 
characterised in that said deciding means makes use, in deciding a current mode of said modes for each current 
speech frame segmented from said input speech signal at said frame period, of feature quantities of at least one 
kind extracted from said current speech frame and a previous speech frame segmented at least one frame period 
prior to said current speech frame and of a previous mode decided at least one frame period prior to said current 
mode. 

7. A speech signal encoder device as claimed in claim 6, characterised in that said feature quantities are rates of var- 
iation with time in said feature quantities. 

8. A speech signal encoder device as claimed in claim 7, further comprising means (81) for extracting each of primary 
quantities of said feature quantities from said current speech frame, characterised in that said deciding means (49) 
comprises: 

means (83) for extracting said rates of variation from said current and said previous speech frames as sec- 
ondary quantities of said feature quantities; and 

mode deciding means (85, 87) for deciding said current mode in response to said primary and said second- 
ary quantities and said previous mode. 

9. A speech signal encoder device as claimed in claim 8, characterised in that: 

said mode deciding means (85, 87) adjusts said current mode into an adjusted mode in response to said pri- 
mary and said secondary quantities and said previous mode; 

said encoding means (65, 69, 73, 33) using, as said modes, adjusted modes produced by said mode decid- 
ing means for said input speech signal. 

10. A speech signal encoder device as claimed in any one of claims 6 to 9, characterised in that each of said feature 
quantities is one of a pitch prediction gain, a short-period predicted gain, a level, and a pitch of said current speech 
frame. 

11. A speech signal encoder device comprising segmenting means (31) for segmenting an input speech signal into 
original speech frames at a predetermined frame period, deciding means (49) for using said original speech frames 
in deciding a predetermined number of modes of said original speech frames to produce decided mode results, 
extracting means (101, 103) for extracting pitches from said input speech signal, and encoding means (65, 69, 73, 
33) for encoding said input speech signal into codes at said frame period and in response to said modes to produce 
said decided mode results and said codes as an encoder device output signal, characterised in that: 

said extracting means comprises: 

feature quantity extracting means (105) for extracting feature quantities by using at least each current 
speech frame segmented from said input speech signal at said frame period; and 

feature quantity adjusting means (107, 109) for using said feature quantities as said pitches to adjust said 
pitches into adjusted pitches in response to each current mode decided for said current speech frame and a previ- 
ous mode decided at least one frame period prior to said current mode; 

said encoding means encoding said input speech signal into said codes in response further to said adjusted 
pitches. 

12. A speech signal encoder device as claimed in claim 11, characterised in that said feature quantity extracting means 
(105) extracts said pitches in response to said current speech frame and rates of variation with time in said pitches 
in response to said current speech frame and a previous speech frame segmented at least one frame period prior 
to said current speech frame. 

13. A speech signal encoder device as claimed in claim 11 or 12, characterised in that each of said feature quantities 
is one of a pitch prediction gain, a short-period predicted gain, a level, and a pitch of said current speech frame. 

14. A speech signal encoder device comprising segmenting means (31) for segmenting an input speech signal into 
original speech frames at a predetermined frame period, deciding means (49) for using said original speech frames 
in deciding a predetermined number of modes of said original speech frames to produce decided mode results, 
extracting means (121) for extracting levels from said input speech signal, and encoding means (65, 69, 73, 33) for 
encoding said input speech signal into codes at said frame period and in response to said modes to produce said 
decided mode results and said codes as an encoder device output signal, characterised in that: 
said extracting means comprises: 

feature quantity extracting means (125) for extracting feature quantities by using at least each current 
speech frame segmented from said input speech frame at said frame period; and 

feature quantity adjusting means (127, 129) for using said feature quantities as said levels to adjust said lev- 
els into adjusted levels in response to each current mode decided for said current speech frame and a previous 
mode decided at least one frame period prior to said current mode; 

said encoding means encoding said input speech signal into said codes in response further to said adjusted 
levels. 

15. A speech signal encoder device as claimed in claim 14, characterised in that said feature quantity extracting means 
(125) extracts said levels in response to said current speech frame and rates of variation with time in said levels in 
response to said current speech frame and a previous speech frame segmented at least one frame period prior to 
said current speech frame. 

16. A speech signal encoder device as claimed in claim 14 or 15, characterised in that each of said feature quantities 
is one of a pitch prediction gain, a short-period predicted gain, a level, and a pitch of said current speech frame. 

17. A speech signal encoder device as claimed in any one of claims 1 to 3, 5 to 9, 11, 12, 14, and 15, further comprising 
weighting means (47) for perceptually weighting said original speech frames into weighted speech frames, charac- 
terised in that said deciding means (49) uses said weighted speech frames in deciding said modes. 